(54) Robust digital watermarking

(57) A watermarking procedure applicable to images, audio, video and multimedia data divides the data to be watermarked into a set of nxn blocks, such as the 8x8 blocks of MPEG. The same watermark signal can be distributed throughout the set of blocks in a large variety of ways. This allows the insertion algorithm to be changed without affecting the decoders. The decoding procedure first sums together the DCT coefficients of N sets of 8x8 blocks to form a set of N summed 8x8 blocks and then extracts the watermark from the summed blocks. Since the sum of the DCT blocks is equal to the DCT of the sum of the intensity blocks, efficient decoding can occur in both the spatial and frequency domains. The symmetric nature of the decoding process allows geometric distortions to be handled in the spatial domain and other signal distortions to be handled in the frequency domain. Moreover, insertion of a watermark signal into image data and the subsequent extraction of the watermark from watermarked image data which has been subject to distortion between the times of insertion and extraction involves the insertion of multiple watermarks designed to survive predefined distortions of the image data, such as pan-scan or letterbox mode transformations. Alternatively, a registration pattern in the image data, after the image data containing the registration pattern is subject to an unknown distortion, is used to compensate for distortion of the watermarked image data.
Description

[0001] The present invention relates to digital watermarking of data including image, video and multimedia data. Specifically, the invention relates to insertion and detection or extraction of embedded signals for purposes of watermarking, in which the insertion and detection procedures are applied to sums of subregions of the data. When these subregions correspond to the 8x8 pixel blocks used for MPEG (moving picture experts group) and JPEG (joint photographic experts group) compression and decompression, the watermarking procedure can be tightly coupled with these compression algorithms to achieve very significant savings in computation. The invention also relates to the insertion and detection of embedded signals for the purposes of watermarking, in which the watermarked data might have undergone distortion between the times of insertion and detection of the watermark.

2. Description of the Prior Art: 

[0002] The proliferation of digitized media such as image, video and multimedia is creating a need for a security system that facilitates the identification of the source of the material.

[0003] Content providers, i.e. owners of works in digital data form, have a need to embed signals into 
video/image/multimedia data, which can subsequently be detected by software, and/or hardware devices for purposes 
of authentication of copyright ownership, and copy control and management. 

[0004] For example, a coded signal might be inserted in data to indicate that the data should not be copied. The embedded signal should preserve the image fidelity (image quality), be robust to common signal transformations and resistant to tampering. In addition, consideration must be given to the data rate that can be provided by the system, though current requirements are relatively low (a few bits per frame).

[0005] In U.S. Patent Application Serial No. 08/534,894, filed September 28, 1995, entitled "Secure Spread Spectrum Watermarking for Multimedia Data", which is incorporated herein by reference, there was proposed a spread spectrum watermarking method which embedded a watermark signal into perceptually significant regions of an image for the purposes of identifying the content owner and/or possessor. A strength of this approach is that the watermark is very difficult to remove. In fact, this method only allows the watermark to be read if the original image or data is available for comparison. This is because the original spectrum of the watermark is shaped to that of the image through a non-linear multiplicative procedure, and this spectral shaping must be removed prior to detection by matched filtering. In addition, the watermark is usually inserted into the N largest spectral coefficients, the ranking of which is not preserved after watermarking. This method does not allow software and hardware devices to directly read embedded signals without access to the original unwatermarked material.

[0006] In an article by Cox et al., entitled "Secured Spectrum Watermarking for Multimedia", available through the Internet at http://www.neci.nj.nec.com/tr/index.html (Technical Report No. 95-10), spread spectrum watermarking is described which embeds a pseudo-random noise sequence into the digital data for watermarking purposes.

[0007] The above prior art watermark extraction methodology requires the original image spectrum be subtracted 
from the watermark image spectrum. This restricts the use of the method when there is no original image or original 
image spectrum available to the decoder. One application where this presents a significant difficulty is for third party 
device providers desiring to read embedded information for operation or denying operation of such a device. 

[0008] In U.S. Patent No. 5,319,735 by R. D. Preuss et al. entitled "Embedded Signaling", digital information is encoded to produce a sequence of code symbols. The sequence of code symbols is embedded in an audio signal by generating a corresponding sequence of spread spectrum code signals representing the sequence of code symbols. The frequency components of the code signal are essentially confined to a preselected signaling band lying within the bandwidth of the audio signal, and successive segments of the code signal correspond to successive code symbols in the sequence. The audio signal is continuously frequency analyzed over a frequency band encompassing the signaling band and the code signal is dynamically filtered as a function of the analysis to provide a modified code signal with frequency component levels which are, at each time instant, essentially a preselected proportion of the levels of the audio signal frequency components in corresponding frequency ranges. The modified code signal and the audio signal are combined to provide a composite audio signal in which the digital information is embedded. This composite audio signal is then recorded on a recording medium or is otherwise subjected to a transmission channel. Two key elements of this process are the spectral shaping and spectral equalization that occur at the insertion and extraction stages, respectively, thereby allowing the embedded signal to be extracted without access to the unwatermarked original data.
[0009] In U.S. Patent No. 5,848,155 by Cox entitled "A Spread Spectrum Watermark for Embedded Signaling" and incorporated herein by reference, there is described a method for extracting a watermark of embedded data from watermarked images or video without using an original or unwatermarked version of the data.

[0010] This method of watermarking an image or image data for embedded signaling requires that the DCT (discrete cosine transform) and its inverse (IDCT) of the entire image be computed. There are fast algorithms for computing the DCT in N log N time, where N is the number of pixels in the image. However, for N=512x512, the computational requirement is still high, particularly if the encoding and extracting processes must occur at video rates, i.e. 30 frames per second. This method requires approximately 30 times the computation needed for MPEG-II decompression.
[0011] One possible way to achieve real-time video watermarking is to only watermark every N-th frame. However, content owners wish to protect each and every video frame. Moreover, if it is known which frames contain embedded signals, it is simple to remove those frames with no noticeable degradation in the video signal.

[0012] An alternative option is to insert the watermark into nxn blocks of the image (subimages) where n<<N. If the block size is chosen to be 8x8, i.e. the same size as that used for MPEG image compression, then it is possible to tightly couple the watermark insertion and extraction procedures to those of the MPEG compression and decompression algorithms. Considerable computational saving can then be achieved since the most expensive computations relate to the calculation of the DCT and its inverse and these steps are already computed as part of the compression and decompression algorithm. The incremental cost of watermarking is then very small, typically less than five percent of the computational requirements associated with MPEG.

[0013] U.S. Patent Application Serial No. 08/715,953, filed September 19, 1996, entitled "Watermarking of Image Data Using MPEG/JPEG Coefficients" which is incorporated herein by reference, advances this work by using MPEG/JPEG coefficients to encode the image data.

[0014] U.S. Patent Application Serial No. 08/746,022, filed November 5, 1996, entitled "Digital Watermarking", which is incorporated herein by reference, describes storing watermark information into subimages and extracting watermark information from subimages.

[0015] A review of watermarking is found in an article by Cox et al., entitled "A review of watermarking and the importance of perceptual modeling" in Proc. of EI'97, vol. 3016, February 9-14, 1997.

[0016] There have been several proposals to watermark MPEG video or JPEG compressed still images. In all cases, each 8x8 DCT block is modified to contain the watermark or a portion thereof. Consequently, decoding of the watermark requires that each 8x8 block be individually analyzed to extract the watermark signal contained therein. The individual extracted signals may then be combined to form a composite watermark, which is then compared with known watermarks. Because each block must be analyzed individually, an uncompressed image must be converted back to the block-based DCT representation, which is computationally expensive. Thus, while the decoder may be computationally efficient in the DCT domain, extracting a watermark from the spatial domain is much more expensive.
[0017] To allow for computationally efficient detection of the watermark in both the spatial and DCT domains, a watermark may be inserted in the sum of all the 8x8 blocks in the DCT domain, or the sum of a subset of all the 8x8 blocks in the DCT domain. A major advantage of this approach is that if the image is only available in the spatial domain, then the summation can also be performed in the spatial domain to compute a small set of summed 8x8 blocks and only those blocks must then be transformed into the DCT domain. This is because the sum of the DCT blocks is equal to the DCT of the sum of the intensities. Thus, the computational cost of decoding in the DCT and spatial domains is approximately the same.
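
The linearity property relied on here can be checked numerically. The following is a minimal sketch, assuming an orthonormal 8x8 DCT-II and randomly generated intensity blocks; the helper names `dct_matrix` and `dct2` and the block data are illustrative assumptions, not part of the described method.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C; the 2-D DCT of a block B is C @ B @ C.T."""
    k = np.arange(n).reshape(-1, 1)   # frequency index (rows)
    x = np.arange(n).reshape(1, -1)   # sample index (columns)
    c = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * x + 1) * k / (2 * n))
    c[0, :] = np.sqrt(1.0 / n)        # DC row has its own normalization
    return c

def dct2(block, c):
    """2-D DCT of one square block."""
    return c @ block @ c.T

rng = np.random.default_rng(0)
# Hypothetical data: 100 random 8x8 intensity blocks.
blocks = rng.integers(0, 256, size=(100, 8, 8)).astype(float)
C = dct_matrix(8)

sum_of_dcts = sum(dct2(b, C) for b in blocks)   # 100 transforms, then a sum
dct_of_sum = dct2(blocks.sum(axis=0), C)        # one sum, then 1 transform

linearity_holds = np.allclose(sum_of_dcts, dct_of_sum)
```

In the spatial-domain case only the second form is needed, which is why a detector working on uncompressed data has to transform only the small set of summed blocks.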

[0018] A second advantage of watermarking the sum of the DCT blocks is that there are an unlimited number of equivalent methods to apportion the watermark throughout the image. For example, if the watermark requires a change of Δi to the i-th coefficient of the summed DCT block, then, if there are M blocks in the image, Δi/M can be added to each individual block, or block 1 can have Δi added to it and the remaining M-1 blocks left unaltered, ignoring for the moment issues of image fidelity. Because of this one-to-many mapping, it is possible to alter the insertion algorithm without changing the decoder. This is a very important characteristic, since in some watermarking applications, there may be many hardware decoders that are deployed, such that changing the decoder is impractical. However, improvements to the insertion algorithm can still result in improved detection using the approach described herein.
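
The one-to-many mapping can be illustrated with a short sketch. It assumes a single coefficient position tracked across M blocks; the values of `M`, `delta` and the random coefficients are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
M = 50                          # hypothetical number of blocks in the image
coeffs = rng.normal(size=M)     # the i-th DCT coefficient of each block
delta = 3.0                     # required change to the summed coefficient

# Strategy 1: spread the change evenly, delta/M per block.
spread = coeffs + delta / M

# Strategy 2: put the whole change into block 1 and leave the rest unaltered.
single = coeffs.copy()
single[0] += delta

# A decoder that only looks at the sum cannot tell the two strategies apart.
target = coeffs.sum() + delta
same_sum = bool(np.isclose(spread.sum(), target) and np.isclose(single.sum(), target))
```

Any other split whose parts add up to `delta` is equally valid from the decoder's point of view, which leaves the inserter free to choose the split on fidelity grounds.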
[0019] A third advantage of watermarking the sum of the DCT blocks is that watermark signals extracted from these sums have small variances, compared with the amount that they may be changed without causing fidelity problems. This means that, in many cases, it is possible to change an image so that the summed DCT blocks perfectly match the required watermark signal, even though the resulting image appears identical to the original.
[0020] Finally, it is well known that some problems, such as modeling the human visual system, are best performed in the frequency domain, whereas other problems such as geometric transformations are more conveniently dealt with in the spatial domain. Since the computational cost of decoding the watermark is now symmetric, it is possible to switch from spatial to frequency domains at will in order to correct for various signal transformations that may corrupt the watermark.

[0021] The present invention concerns a novel insertion method which employs a specific model of the human visual system which provides much better control over image fidelity. Tests have shown that it is possible to obtain large signals (more than 15 standard deviations from 0 correlation) with images that are indistinguishable from their respective original images.

[0022] The method handles robustness against various types of attacks in ways that are easy to relate to the specific 
type of attack. 

[0023] The method is adaptable so that the model of the human visual system and the techniques used for handling attacks can be changed later without having to change the detector. The result is that it is possible to continue improving watermarking, particularly DVD (digital video disk) watermarking, even after many detectors have been installed. This is analogous to the situation with MPEG video for which encoder technology can be improved without having to change existing decoders.

[0024] Use of the present insertion method allows a simple detection algorithm in either MPEG or decompressed domains.

[0025] The invention also concerns a novel detection method which is easy to implement, easy to analyze and has a low computational cost, whether the incoming video is MPEG compressed or uncompressed.
[0026] The present invention also concerns a novel insertion method that hides multiple patterns in the data. These patterns fall into two categories: 1) registration patterns used during detection to compensate for translational shifts, and 2) watermark patterns that encode the information content of the watermark.

[0027] A principal object of the present invention is the provision of a digital watermark insertion method which allows 
detection of watermarks after the watermarked data is subjected to predefined scale changes, without modification to 
the watermark detector. 

[0028] Another object of the invention is the provision of a watermark detection method that is computationally inexpensive in either the MPEG or decompressed domains.

[0029] A still other object of the invention is the provision of a digital watermarking method that withstands attacks 
without having to change a detector. 

[0030] Further and still other objects of the invention will become more clearly apparent when the following description is read in conjunction with the accompanying drawings.

Brief Description of the Drawings 

[0031]

Figure 1 is a general data flow diagram of a method of inserting a watermark into media data;
Figure 2 is a schematic diagram of an MPEG-2 encoder;
Figure 3 is a schematic diagram of a modified MPEG-2 encoder for reducing degradation of a watermark in watermarked data;
Figure 4 is a schematic diagram of an alternative modified MPEG-2 encoder for reducing degradation of a watermark in watermarked data;
Figure 5 is a flow chart of the process performed in the watermark correction device in Figures 3 and 4;
Figure 6 is a flow chart of the process performed in step 506 of Figure 5;
Figure 7 is a flow chart of an alternative process performed in step 506 of Figure 5;
Figure 8 is a flow diagram of a method of extracting a watermark from MPEG media data;
Figure 9 is a flow diagram of a method of extracting a watermark from uncompressed media;
Figure 10 is a flow diagram of a method of detecting a watermark from MPEG data, with registration; and
Figure 11 is a flow diagram of a method of detecting a watermark from uncompressed image data, with registration.

Detailed Description of the Preferred Embodiment

[0032] As used in the following description, the terms image and image data will be understood to be equally applicable to video, image and multimedia data. The term "watermark" will be understood to include embedded data, symbols, images, instructions or any other identifying information.

[0033] In order to better understand the present invention, first a review of the basic watermarking method will be presented followed by additional descriptions of the improvements comprising the present invention.
[0034] First, we define some notation. Let a watermark to be embedded into an image be an N-dimensional vector, denoted by W[1,...,N]. In the following text, the notation W[1,...,N] is used in the same manner as W[k] (k=1,...,N). Let V[1,...,N] denote a vector value extracted from an image, where the element V[k] corresponds to W[k]. Specifically, the value V[k] is a weighted sum of DCT coefficients given by

V[k] = D_k[1]F_k[1] + D_k[2]F_k[2] + ... + D_k[n_k]F_k[n_k],

where D_k[i] (i=1,...,n_k) indicate members of the set of DCT coefficients used for calculating V[k], n_k indicates the number of members, and F_k[i] (i=1,...,n_k) are weighting coefficients related to a filter processing. The concept of F_k is that the DCT coefficients are weighted according to how much noise might be expected in each coefficient. To calculate V[1,...,N], nxn DCT coefficients are first calculated over a whole image. Then the coefficients are classified into N sets, each of which is related to each element of V[1,...,N]. The rule of classifying DCT coefficients is predetermined, and the same rule is used in both inserting and detecting a watermark.

EP 0 967 803 A2
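
The extraction of V[1,...,N] as a set of weighted sums can be sketched as follows. This is only an illustration of the definition of V[k]; the function name `extract` and the toy coefficient and filter values are assumptions, and the classification of DCT coefficients into the N sets is taken as already done.

```python
import numpy as np

def extract(dct_sets, filters):
    """Return V with V[k] = sum over i of D_k[i] * F_k[i].

    dct_sets[k] holds the coefficients D_k[1..n_k] assigned to element k;
    filters[k] holds the matching weights F_k[1..n_k]. The sets may have
    different sizes n_k.
    """
    return np.array([np.dot(d, f) for d, f in zip(dct_sets, filters)])

# Hypothetical toy data with N = 2 sets of sizes 3 and 2.
D = [np.array([1.0, 2.0, 3.0]), np.array([4.0, -1.0])]
F = [np.array([1.0, 0.5, 0.0]), np.array([0.25, 1.0])]

V = extract(D, F)   # V[0] = 1 + 1 + 0 = 2, V[1] = 1 - 1 = 0
```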

[0035] Before inserting a watermark into an image, a detection algorithm is applied to the image to find a watermark that is already present in the image. If the image does not contain a watermark, the extracted values V[1,...,N] will be normally distributed random numbers that do not correlate with any watermark W[1,...,N]. A watermark W[1,...,N] is inserted into an image by changing each of D_k[1,...,n_k] (k=1,...,N) slightly in order to make the extracted value V[1,...,N] highly correlated with the watermark W[1,...,N]. Let the target value of V[1,...,N] be denoted by ω[1,...,N], that is, the values V[1,...,N] are changed to ω[1,...,N] by inserting the watermark W[1,...,N]. The target values ω[1,...,N] have high correlation with the watermark W[1,...,N] and they are determined as will be described below.

[0036] After the target values ω[1,...,N] are determined, the difference ω[k]-V[k] is distributed among the DCT coefficients D_k[1,...,n_k]. Then a watermark is inserted by adding the allocated difference value to the corresponding DCT coefficients D_k[1,...,n_k]. This change of DCT coefficients must be done in such a manner so as not to change the appearance of the image.

[0037] In distributing the difference ω[k]-V[k], a characteristic of the human visual process is taken into account. The amount of change that does not cause a visible change in the image is different for each DCT coefficient D_k[i]. This amount depends on the human visual process which can be approximately simulated with a computational model. The amount of change is referred to as "slack". The slack is calculated for each DCT coefficient and it is used in distributing the difference ω[k]-V[k] among the DCT coefficients. Next we describe how to calculate the values of the slack using a model of the human visual process.

[0038] The preferred computational model of human visual sensitivity that is used in the present invention is found in an article by Andrew B. Watson, entitled "DCT Quantization Matrices Visually Optimized for Individual Images" in SPIE, vol. 1913 (1993), pp. 202-216. This model was applied to watermarking in an article by Christine I. Podilchuk and Wenjun Zeng entitled "Digital Image Watermarking Using Visual Models", Proc. of EI'97, vol. 3016, February 9-14, 1997. The current invention differs from that of Podilchuk and Zeng in (i) not requiring the original unwatermarked image at the decoder and (ii) not extracting the watermark from the individual 8x8 blocks, but from the sum of a set of 8x8 blocks. Other computational models are also usable.

[0039] For each element of the image's block DCT, d[i,j], this model computes a value called the element's "slack", S[i,j], which indicates how much a particular d[i,j] value may be altered before such an alteration becomes visible. The value is computed in three steps. The first step models the contrast masking phenomenon of the human visual system and models the visual sensitivity at different frequencies and handles the difference between visual sensitivity to changes in different frequencies. The second step models the luminance masking phenomenon of the human visual system and handles the fact that the visual system is more sensitive to changes in dark regions than to changes in bright regions. The third step handles the fact that the sensitivity to changes depends in part on the percentage that the frequency is changing (i.e. a DCT term with a small value in it may only change a little, while one with a larger value may change more).

[0040] The perceptual model makes use of a matrix of values that indicate the relative sensitivity of the human visual system to the different terms of a spatial 8x8 DCT. The formulae for computing this matrix are available in an article by Albert J. Ahumada Jr. and Heidi A. Peterson entitled "Luminance-Model-Based DCT Quantization for Color Image Compression", in SPIE, vol. 1666, (1992) pp. 365-374.

[0041] After computing the slacks for all the 8x8 DCTs in the image, a slack can be assigned to each D_k[1,...,n_k]. Call these slacks S_k[1,...,n_k]. It is now possible to distribute the changes in the V's over all the D_k's with minimal visual impact. This is done according to the following formula:

D'_k[i] = D_k[i] + (ω[k] - V[k]) * S_k[i] / (Σ_j S_k[j]F_k[j])

where D'_k[1,...,n_k] are the modified 8x8 DCT coefficients, and ω[k] and V[k] are the k-th elements of ω and V, respectively. The effect of this formula is to distribute the desired change in a given element of the watermark vector (ω[k] - V[k]) over all the DCT coefficients that are summed at calculation of V[k] to produce that element, proportionately according to those DCT coefficients' slacks. To illustrate, consider two simple examples: 1) If all the slacks are 0 except for slack S_k[m], then the sum of all the values S_k[j]F_k[j] is equal to S_k[m]F_k[m], and only D_k[m] is changed. It is changed by the full value of (ω[k]-V[k]). 2) If all the slacks are equal and all the coefficients F_k[i]=1, then each D_k[i] is changed by the same amount.
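
A slack-proportional distribution of this kind can be sketched as below. The sketch assumes the update D'_k[i] = D_k[i] + (ω[k] - V[k]) * S_k[i] / Σ_j S_k[j]F_k[j], which is the form consistent with the surrounding definitions; the function name `insert_element` and the sample numbers are hypothetical.

```python
import numpy as np

def insert_element(d, f, s, omega_k):
    """Distribute (omega_k - V[k]) over the coefficients d in proportion
    to their slacks s, so that the weighted sum of the result is omega_k.

    d: DCT coefficients D_k[i]; f: filter weights F_k[i]; s: slacks S_k[i].
    """
    v_k = np.dot(d, f)                 # currently extracted value V[k]
    denom = np.dot(s, f)               # sum over j of S_k[j] * F_k[j]
    if denom == 0:
        raise ValueError("no usable slack in this set")
    return d + (omega_k - v_k) * s / denom

d = np.array([10.0, -4.0, 2.0])
f = np.array([1.0, 1.0, 0.5])
s = np.array([2.0, 0.0, 4.0])          # the second coefficient has no slack

d_new = insert_element(d, f, s, omega_k=9.0)
v_new = np.dot(d_new, f)               # re-extract from the modified coefficients
```

With these numbers V[k] is 7 and the target is 9, so the difference of 2 is split as +1 and +2 over the first and third coefficients; the coefficient with zero slack is left untouched, and the re-extracted value equals the target.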

[0042] After making these changes, convert all the 8x8 DCTs back into the spatial domain, and the result is a watermarked image. It is easy to show that the sum of all the D'_k[i]F_k[i]'s for a given k will equal ω[k]. The process of making these changes is referred to as "inserting Omega into the image". The watermark extracted from the resulting image, if the image has not been attacked, will be exactly ω, not ω plus noise.
[0043] There are two important issues remaining to be discussed. First, how to decide on ω, and, second, how to make the watermark robust.

[0044] Previously, the equivalent of ω was computed as:

ω = V + α*W

where α is a small constant, and W is a zero-mean watermark signal. It is possible to use the same formula here, but it is too limiting to result in the strongest possible watermark using the present invention. In practicing the invention, it is often possible to insert an ω that has perfect correlation with the watermark, W, without causing any visible change in the image. The following formula is used:

ω = mean(V) + β*(V - mean(V)) + α*W

[0045] This result is a weighted sum of the watermark signal and the original, noise (image) signal. If β is set to 0, the result is an ω that perfectly correlates with W.
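
The behaviour of the target formula for different weights can be illustrated with a small sketch. It assumes the formula ω = mean(V) + β*(V - mean(V)) + α*W with random stand-ins for V and W; the function name `target` and the chosen values of α and β are illustrative only.

```python
import numpy as np

def target(v, w, alpha, beta):
    """Target vector: mean(V) + beta * (V - mean(V)) + alpha * W."""
    return v.mean() + beta * (v - v.mean()) + alpha * w

rng = np.random.default_rng(2)
v = rng.normal(size=64)     # hypothetical values extracted from an unmarked image
w = rng.normal(size=64)     # hypothetical watermark signal
w -= w.mean()               # the text assumes a zero-mean watermark

# beta = 0 removes the image "noise" entirely, so omega is a shifted,
# scaled copy of W and correlates perfectly with it.
omega = target(v, w, alpha=0.5, beta=0.0)
corr = np.corrcoef(omega, w)[0, 1]

# beta = 1 reduces to the earlier additive rule V + alpha * W.
additive = target(v, w, alpha=0.5, beta=1.0)
```

Intermediate values of β trade the strength of the correlation against how far the extracted values are moved from those of the original image.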

[0046] The signal to noise ratio (SNR) for an unattacked image will be:

SNR = α*std(W) / (β*std(V))

where std(X) is the standard deviation of X.

[0047] There are many ways to choose α and β based on optimizations to maximize different criteria such as fidelity or robustness.

[0048] At this point there is a complete method of inserting watermarks. The method contains explicit modeling of human vision, but it does not contain any explicit method of making the watermark robust. In fact, the method as described so far will try to put as much of the watermark as possible into the high frequencies since these frequencies have the largest slack, but this is a poor thing to do from the point of view of robustness.

[0049] To make the watermark robust against a given set of attacks or signal degradations, it is first necessary to consider how those attacks affect the various terms of the 8x8 DCTs in the image. Then, terms that are affected by attacks or signal degradations in similar ways are grouped together, and watermarked as if the group of terms were a separate image.

[0050] The following is a simple example. Suppose there is only concern about two possible attacks: cropping 24 columns of pixels from the left side of the image, or cropping 24 columns of pixels from the right side of the image. This results in three groups of DCT terms: those that come from the 3 left-most columns of 8x8 DCT blocks, those that come from the 3 right-most columns, and those that come from the rest of the image. All the terms in each of these groups either survive or are destroyed together by any given attack. If each group is watermarked as though it were a separate image, then the watermark from at least one group will generally survive attack (assuming that the 24-column cropping attacks are the only attacks possible), and the watermark that is extracted will consist of the correct watermark, from that group, plus some noisy watermarks from groups that were damaged by the attack.
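
The grouping in this cropping example can be sketched as a simple classification of block columns. The function name `crop_group` and the 720-pixel image width are assumptions made for illustration; 24 pixel columns correspond to 3 columns of 8x8 blocks.

```python
def crop_group(block_col, n_block_cols, margin=3):
    """Assign a column of 8x8 blocks to the group that shares its fate
    under the two cropping attacks (24 pixels = 3 block columns)."""
    if block_col < margin:
        return "left"      # destroyed by cropping the left side
    if block_col >= n_block_cols - margin:
        return "right"     # destroyed by cropping the right side
    return "middle"        # survives both attacks

n_cols = 720 // 8          # hypothetical 720-pixel-wide image: 90 block columns
groups = [crop_group(c, n_cols) for c in range(n_cols)]
counts = {g: groups.count(g) for g in ("left", "middle", "right")}
```

Each of the three groups would then be watermarked as though it were a separate image.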
[0051] A more interesting example is low-pass and high-pass filtering attacks. It is possible to group all the low frequencies together into one group, and all the high frequencies into one or more other groups. If the predetermined rule for classifying DCT coefficients into the N sets is designed in such a way that each set has coefficients of many different frequencies, then the complete watermark can be inserted into each group. Then, if the high frequencies are removed, the watermark will still be detectable in the low frequencies, and vice-versa.

[0052] The more groups the terms are divided into, the more robust will be the watermark. There is a cost because it will become increasingly more difficult to distribute the changes without causing visible distortion.

[0053] It is important to note that the best balance can be achieved after the detectors are in wide use. It is possible 
to modify the insertion algorithm to make watermarks robust against a wide variety of attacks without having to change 
detection at all. 

[0054] Presently only three groups are used. One group collects together most of the low frequencies. Each of the other two groups represents one higher frequency.

[0055] In the preferred method, a watermark is not placed in any of the higher frequencies. The reason is not that they are susceptible to attack (that is handled by the design of the filter used in detection). Rather, it is because watermarking the higher frequencies causes MPEG compression rates to go down substantially. An alternative solution might be to add other groups that contain higher frequencies.

[0056] In Figure 1, there is shown a flow diagram of the watermark insertion method. The digital image is divided into a collection of nxn blocks, preferably 8x8 blocks, in step 10. The discrete cosine transform (DCT) of each block is computed in a known manner in step 12. The DCTs are separated into groups that respond to different attacks in the same manner in step 14. A first group G is selected in step 16.
[0057] Next, extract a watermark V, using only the DCT terms in the group G in step 18. Determine a new signal (target value) ω selected such that ω is similar to V but is highly correlated with watermark W in step 20.
[0058] Add fractions of ω-V to the DCT terms in G according to perceptual slack in step 22. Decide whether group G is the last group in step 24. If not, select next group G in step 26, and extract watermark V using only terms of next group G in step 18 and continue procedures until the last group G is found in step 24. Next, compute the inverse DCTs of the blocks in step 28 resulting in a watermarked image.
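
The per-group loop of steps 16 to 26 can be sketched as follows. This is a simplified illustration rather than the full method: it assumes the block division, DCT, grouping and inverse DCT (steps 10, 12, 14 and 28) happen elsewhere, that every set has the same size n, that all slacks are positive, and that the target formula and slack-proportional update take the forms given earlier. The name `insert_watermark` and the random test data are hypothetical.

```python
import numpy as np

def insert_watermark(groups, w, alpha=1.0, beta=0.0):
    """groups: list of (D, F, S) tuples, each an array of shape (N, n),
    where row k holds D_k[i], F_k[i] and S_k[i] for one group G."""
    marked = []
    for D, F, S in groups:                            # steps 16, 24, 26: visit each group G
        V = (D * F).sum(axis=1)                       # step 18: extract V from G only
        omega = V.mean() + beta * (V - V.mean()) + alpha * w   # step 20: target value
        denom = (S * F).sum(axis=1)                   # per-set sum of S_k[j] * F_k[j]
        D_new = D + ((omega - V) / denom)[:, None] * S  # step 22: slack-weighted update
        marked.append(D_new)
    return marked

rng = np.random.default_rng(3)
N, n = 16, 40
w = rng.normal(size=N)
w -= w.mean()                                         # zero-mean watermark
groups = [(rng.normal(size=(N, n)),                   # coefficients D
           np.ones((N, n)),                           # filter weights F
           rng.uniform(0.1, 1.0, size=(N, n)))        # positive slacks S
          for _ in range(3)]

marked = insert_watermark(groups, w)
# Re-extract from each group separately, as a detector could after an attack
# that destroys the other groups.
extracted = [(D_new * F).sum(axis=1) for D_new, (_, F, _) in zip(marked, groups)]
correlations = [np.corrcoef(v, w)[0, 1] for v in extracted]
```

Because each group is treated as a separate image, the complete watermark is recoverable from any single group on its own.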

[0059] Some alternative steps in the insertion method are possible. For example, the distribution of the difference between ω and V over DCT terms can be done stochastically to help deter tampering and reduce susceptibility to tampering.

[0060] Also, the grouping of DCT terms for robustness purposes could be performed dynamically. For example, the insertion program could simulate various attacks on the image and determine the effect on the values of the DCT terms. Then, the program would cause appropriate allocation of the terms into the groups similarly affected.
[0061] The inserter can be designed with a user-interface that allows the user to set two parameters: (1) the maximum perceptual difference (J) between the original image and the watermarked image and (2) the maximum allowable probability of missed detection after any of the predefined set of attacks. The algorithm would then insert watermarks into a large number of images automatically, according to the allowable perceptual change J and checking each one against simulations of the attacks. If an image fails to meet the specified robustness constraint (maximum allowable probability of missed detection), then the user would be notified so that a manual decision can be made to compensate or trade off image fidelity for robustness.

[0062] In addition, the distribution of the difference between ω and V over DCT terms can be modified to explicitly compensate for MPEG quantization. Using the above watermark insertion method, there may result a degraded watermark in the watermarked data. In order to enhance the watermark in the watermarked data after MPEG compression, several techniques are possible.
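
Why quantization can degrade an inserted watermark can be seen in a small sketch. It assumes a plain uniform quantizer with a single step size, which is a simplification of actual MPEG quantization; the coefficient values, the changes and the step size `q` are hypothetical.

```python
import numpy as np

def quantize(coeff, q):
    """Uniform quantizer: map a coefficient to an integer level."""
    return np.round(coeff / q)

def dequantize(level, q):
    """Reconstruct the coefficient value a decoder would see."""
    return level * q

q = 16.0                                  # hypothetical quantizer step size
coeff = np.array([36.0, 36.0])            # two identical unmarked coefficients
change = np.array([3.0, 9.0])             # watermark changes of different size

plain = dequantize(quantize(coeff, q), q)             # decoded without watermark
marked = dequantize(quantize(coeff + change, q), q)   # decoded with watermark
surviving = marked - plain                            # what the decoder still sees
```

Here the change of 3 rounds to the same level as the unmarked coefficient and disappears, while the change of 9 crosses a level boundary and emerges as a full step of 16. An inserter that knows the quantizer can therefore choose how to apportion the difference so that the decoded sums stay closer to the target.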

[0063] Figure 2 is a schematic diagram of a typical MPEG-2 encoder. Figure 2 depicts elements which are indispensable to execute an MPEG-2 encoding of P pictures, or to perform a combined interframe prediction and DCT coding. Input images are provided as one input to subtractor 30. The other input to subtractor 30 is a predicted image generated in frame memory 32. The predicted images are subtracted from the input images at subtractor 30. A discrete cosine transform (DCT) is performed at DCT calculator 34 on the output signal from subtractor 30. The DCT coefficients are quantized in quantizer 36. The outputs of the quantizer 36 are sent to a variable length encoder 38 where Huffman encoding is performed. The quantized DCT coefficients outputted from the quantizer 36 are also sent to an inverse quantizer 40 where they are de-quantized. Inverse DCT of the de-quantized DCT coefficients is performed in the inverse DCT calculator 42. The results are added at adder 44 to the predicted image outputted from the frame memory 32, and then an image which is expected to be the same as that acquired in a decoder is reconstructed. The reconstructed image is called "a locally decoded image." This locally decoded image is stored in the frame memory 32 to produce the predicted images.

[0064] Figure 3 is a schematic diagram of a modified MPEG-2 encoder for reducing degradation of a watermark in watermarked data. Before MPEG-2 encoding, a DCT is performed on an input image at the DCT calculator 50, and watermark signals are added to the DCT coefficients at adder 52. The output DCT coefficients, including the watermark information, are subjected to an inverse DCT in the inverse DCT calculator 54. The outputs of inverse DCT calculator 54 are images with a watermark. These watermarked images are sent to the MPEG-2 encoder, and MPEG-2 encoding is performed as described above. In addition, in this embodiment, the watermark information is modified in order to be suited to MPEG-2 compression. DCT coefficients for the predicted images are calculated in DCT calculator 56. The quantization values outputted from quantizer 36 are de-quantized in inverse quantizer 58. The results of inverse quantization are added at adder 60 to the DCT coefficients outputted from DCT calculator 56. The results of the addition correspond to the DCT coefficients for the decoded images which are expected to be generated in a decoder. These DCT coefficients are inputted into a watermark correction device 62. The watermark correction device 62 outputs watermark correction signals. At adder 64, the watermark correction signals from device 62 are added to the quantization values from quantizer 36. The output of adder 64 is used as the input to variable length encoder 38 and inverse quantizer 40.
[0065] Next, we describe the process performed in the watermark correction device 62. Let us introduce some new notation to explain the process. Let Dq_k[i] be the quantization value corresponding to D_k[i], that is, the quantization value of the i-th member of the k-th set for calculating the value V. Let Q_k[i] be the quantization step size used in obtaining Dq_k[i]. Let Dr_k[i] be the output value of adder 60 that is obtained by adding the inverse quantization value of Dq_k[i] calculated in inverse quantizer 58 to the corresponding DCT coefficient outputted from DCT calculator 56. Let Vr[1,...,N] be the value extracted from the output values of adder 60, Dr_k[i], in the same manner that the value V[1,...,N] is calculated in inserting a watermark. We assume that the target value ω[1,...,N] is also available in the watermark correction device 62.

[0066] Figure 5 shows the flow chart describing the process performed in the watermark correction device 62. First, the index k of the watermark element is set to 1 at step 500. Next, the value Vr[k] is calculated at step 502 by

Vr[k] = F_k[1]Dr_k[1] + ... + F_k[n_k]Dr_k[n_k],

where the weighting coefficients F_k[1,...,n_k] are the same values as those used in calculating V[k]. Then the absolute value of the difference between the value Vr[k] and the target value ω[k], and the sign of the difference, are computed in step 504 by the following equations:

Dif = |Vr[k] - ω[k]|
s = Sign(Vr[k] - ω[k]),

where

Sign(x) = 1 (x > 0)
Sign(x) = -1 (x < 0)

The value Dif corresponds to the distortion, generated in the quantization process, of the watermark inserted at adder 52.
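As a rough illustration of steps 502 and 504, the following Python sketch computes Vr[k], Dif and s for a single watermark element. The function name `extraction_error` and the list-based arguments are assumptions made for this example. Treating a zero difference as positive is likewise an assumption, since the text gives Sign only for non-zero arguments.

```python
def extraction_error(F_k, Dr_k, omega_k):
    """Return (Vr_k, Dif, s) for one watermark element k."""
    # Step 502: weighted sum of the reconstructed DCT coefficients.
    vr_k = sum(f * d for f, d in zip(F_k, Dr_k))
    # Step 504: magnitude and sign of the deviation from the target value.
    diff = vr_k - omega_k
    dif = abs(diff)
    s = 1 if diff >= 0 else -1  # assumption: a zero difference maps to +1
    return vr_k, dif, s
```

Here `F_k` and `Dr_k` stand for F_k[1,...,n_k] and Dr_k[1,...,n_k], and `omega_k` stands for the target value ω[k].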

[0067] On the basis of the absolute value Dif and the sign s, watermark correction signals are generated in step 506. The process in step 506 is described later. After the generation of the watermark correction signals for the DCT coefficients related to the k-th element of the watermark, the index k is compared with N in step 508. If k > N, the process is finished. If k ≤ N, the value k is increased by one in step 510, and the process goes back to step 502. The watermark correction process is thus performed, and the obtained watermark correction signals are finally outputted to adder 64.

[0068] Next, we describe the process performed in step 506 in Figure 5, using the flow chart in Figure 6. In step 506, the watermark correction signals for the DCT coefficients related to the k-th element of the watermark are generated. The elements of the array ΔDq_k[i] (i=1,...,n_k) are first all set to zero in step 520. Next, the value j is set to one in step 522. Then the index i of the DCT coefficient is found through a permuting function p(j) in step 523. The function p(j) returns the j-th value of a permutation of the integers 1 to n_k. The simplest example is p(j)=j. Next, the value -s is stored in ΔDq_k[i], and the value Dif is decreased by Q_k[i]F_k[i] in step 524. This indicates that the quantization value Dq_k[i] is changed by -s by adding ΔDq_k[i] to Dq_k[i] at adder 64. The value s is 1 or -1, so the magnitude of the change in the quantization value is one. In other words, the de-quantized value obtained in an inverse quantizer in a decoder is changed by -sQ_k[i], that is, by one step size. The value Dif after the update is identical to the absolute value of the difference between ω[k] and Vr[k] calculated with the corrected quantization values Dq_k[i] + ΔDq_k[i] (i=1,...,n_k). After step 524, the value Dif is compared with zero, and the index j with n_k, in step 526. If Dif < 0 or j > n_k, this subroutine is finished. If the condition is not satisfied, the index j is increased by a certain amount Δj in step 528, and the process returns to step 523.
[0069] Instead of the process shown in Figure 6, an alternative process shown in Figure 7 can be used as the process of step 506. In the process shown in Figure 7, a step 530, which checks whether or not the quantization value Dq_k[i] equals zero, is added between step 523 and step 524. In this case, step 524 is performed only if the quantization value Dq_k[i] is non-zero. This limits the increase in the number of bits caused by correcting watermark information, because changing a quantization value from zero to a non-zero value generally results in a large increase in the number of bits.
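The loops of Figures 6 and 7 can be sketched in Python as follows. This is one minimal reading of the flow charts rather than a definitive implementation: the function name `correction_signals`, the zero-based indexing, the identity permutation p(j)=j, and the bounds check performed before each array access are all assumptions of this example. Setting `skip_zero=True` corresponds to the extra check of step 530.

```python
def correction_signals(Dq_k, Q_k, F_k, dif, s, skip_zero=False, step=1):
    """Return (delta_Dq_k, remaining_dif) for one watermark element."""
    n_k = len(Dq_k)
    delta = [0] * n_k            # step 520: all corrections start at zero
    j = 0                        # step 522 (zero-based in this sketch)
    while j < n_k and dif >= 0:  # step 526, checked before each access
        i = j                    # step 523 with p(j) = j (assumption)
        if not (skip_zero and Dq_k[i] == 0):  # step 530 (Figure 7 only)
            delta[i] = -s                     # step 524: move toward target
            dif -= Q_k[i] * F_k[i]
        j += step                # step 528: advance by the increment
    return delta, dif
```

The corrected quantization values would then be Dq_k[i] + delta[i], as formed at adder 64.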

[0070] Figure 4 is a schematic diagram of an alternative embodiment of a modified MPEG-2 encoder for reducing degradation of a watermark in watermarked data. In this embodiment, the basic concept is the same as that described in connection with Figure 3. The difference lies in the fact that subtraction of the predicted images from the input original images is performed in the DCT domain, not in the spatial domain. For the predicted image outputted from the frame memory 32, a DCT is performed in DCT calculator 70, and the results are subtracted from the watermarked DCT coefficients at subtractor 72. The results of the subtraction are sent to the quantizer 36, and then the watermark correction is performed in the same manner as shown in Figure 3. The results outputted from subtractor 72 are the same as the results outputted from the DCT calculator 34 in Figure 3 because of the linearity of the DCT. Therefore, the results obtained in the subsequent processes, starting with quantization in quantizer 36, are the same as those in Figure 3. This embodiment results in a reduction in the number of DCT calculations.

[0071] The detection procedure to detect a watermark in an image will now be described.

[0072] If MPEG video is the input image data format, the following detection process determines whether watermark W is present, where W[1,...,N] is the watermark being tested for.

[0073] Decode the Huffman code, but do not compute the inverse DCTs, so that, for each frame (at least, each I-frame), there is an array of 8x8 DCTs.

[0074] Next, perform the same summation of DCT coefficients that was performed during watermark insertion to obtain the vector V. Compute the correlation coefficient C between V and the watermark being tested for, W:

V' = V - mean(V)
W' = W - mean(W)
C = (V'·W') / sqrt((V'·V')(W'·W'))

[0075] Finally, convert C into a normalized Fisher Z statistic:

Z = (sqrt(N-3)/2) ln((1+C)/(1-C)),

where N is the length of the watermark.

[0076] The Z value indicates whether the watermark is present. A preferred threshold for Z is 4 (i.e., Z ≥ 4 means the watermark is present), but other values may be used depending on the desired probabilities of false alarms and missed detections.
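The detection statistic can be illustrated with a short Python sketch. It assumes the standard sample correlation coefficient and the usual normalized Fisher transform, Z = (sqrt(N-3)/2) ln((1+C)/(1-C)). The function names `fisher_z` and `is_present`, and the clamping of C away from ±1, are assumptions of this example and are not taken from the text. The sketch requires N > 3 and non-constant vectors.

```python
import math

def fisher_z(V, W):
    """Return (C, Z) for an extracted vector V and a candidate watermark W."""
    N = len(W)
    v_mean = sum(V) / N
    w_mean = sum(W) / N
    Vp = [v - v_mean for v in V]   # V' = V - mean(V)
    Wp = [w - w_mean for w in W]   # W' = W - mean(W)
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    C = dot(Vp, Wp) / math.sqrt(dot(Vp, Vp) * dot(Wp, Wp))
    # Keep C strictly inside (-1, 1) so the logarithm stays finite.
    C = max(min(C, 1 - 1e-12), -1 + 1e-12)
    Z = math.sqrt(N - 3) / 2 * math.log((1 + C) / (1 - C))
    return C, Z

def is_present(V, W, threshold=4.0):
    """Apply the preferred threshold of 4 from the text."""
    return fisher_z(V, W)[1] >= threshold
```

A different `threshold` can be passed to trade false alarms against missed detections.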

[0077] Figure 8 is a flow diagram of the detection method for MPEG video input described above. The input MPEG video is supplied to a Huffman decoder and partial parser 80, whose output is a set of DCT coefficients for nxn, preferably 8x8, blocks of the video input.

[0078] The nxn DCTs are provided to watermark accumulator 82. Accumulator 82 has memory whose length is the watermark length N. DCT coefficients from the Huffman decoder and partial parser 80 are classified according to a predetermined rule and summed for extracting a watermark as mentioned before, and the results are accumulated in the memory. The extracted watermark is provided to comparator 84, where it is compared with possible watermarks in the image by calculating correlation coefficients between the extracted watermark and the possible watermarks as mentioned before. The possible watermarks are the universe of the watermarks that might have been inserted into the video data. The details of the accumulators and comparator are found, for instance, in U.S. Patent Application Serial No. 08/746,022.

[0079] The output of comparator 84 is the likelihood (normalized Fisher Z statistic) of the detected watermark being each of the possible watermarks. The most likely watermark is determined and is deemed the watermark in the image, or, if the detector output does not exceed a predetermined threshold, then no watermark is deemed present. Alternatively, if the incoming input data comprises an uncompressed image, an embedded watermark can be detected by applying the method above to DCT coefficients obtained by performing an 8x8 DCT over the whole image. In this case, a DCT would have to be performed for each 8x8 block, but a suitably designed rule for classifying DCT coefficients into N sets makes it possible to avoid performing the DCT many times. Before explaining the method to reduce DCT calculation, let us define some notation.
[0080] Let h_m(i,j) (i=1,...,8, j=1,...,8, m=0,...,M-1) be a set of functions that map the frequency indices (i,j) of 8x8 DCT coefficients onto the indices k of the elements of a watermark, where M indicates the number of functions. So if k=h_m(i,j), then a DCT coefficient whose index is (i,j) is classified into the k-th set of DCT coefficients for calculating the value V[k]. We prepare M different functions h_m(i,j) (m=0,...,M-1). Which function is selected for a certain 8x8 block depends on the numbers r and c of the block, where r and c indicate the row and column numbers of the block respectively. So the index of the function, m, is first found according to the values r and c, and then the index of the set, k, is determined by h_m(i,j) for each DCT coefficient. The use of these functions h_m(i,j) in classifying DCT coefficients is assumed in the remaining part of the detailed description.

[0081] In this case, we can reduce the number of DCT calculations in the following manner. First, the sum of the blocks whose indices m are the same is computed for each m=0,...,M-1. Let this summed block be denoted by VB_m[i,j] (m=0,...,M-1). Then a DCT is performed on the M summed blocks VB_m[i,j] (m=0,...,M-1). Finally, the DCT coefficients of the summed blocks are classified into N sets according to the value h_m(i,j), and added together within each set to obtain V[1,...,N]. The obtained results V[1,...,N] are the same as V[1,...,N] obtained with the method mentioned above because the DCT is a linear transform, that is, the sum of DCT blocks equals the result of the DCT for the sum of the blocks. If M is much less than the total number of blocks in an image, the number of DCT calculations is dramatically reduced. This method thus allows us to extract watermarks at small computational cost.
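The equivalence of the two extraction orders can be checked numerically. The Python sketch below compares a per-block extraction with the summed-block extraction. The particular rules `block_index` (m = (r + c) mod M) and `h` (h_m(i,j) = (i·n + j + m) mod N), the zero-based indices, and all function names are hypothetical choices made only for this illustration.

```python
import math

def dct2(block):
    """Orthonormal 2-D DCT-II of an n x n block (list of lists)."""
    n = len(block)
    scale = lambda u: math.sqrt((1 if u == 0 else 2) / n)
    cos = [[math.cos((2 * x + 1) * u * math.pi / (2 * n)) for x in range(n)]
           for u in range(n)]
    return [[scale(u) * scale(v) * sum(block[x][y] * cos[u][x] * cos[v][y]
                                       for x in range(n) for y in range(n))
             for v in range(n)] for u in range(n)]

def h(m, i, j, n, N):
    # Hypothetical classification rule h_m(i,j).
    return (i * n + j + m) % N

def block_index(r, c, M):
    # Hypothetical rule selecting the function index m for block (r, c).
    return (r + c) % M

def extract_per_block(blocks, n, M, N):
    """One DCT per block, then classify and sum the coefficients."""
    V = [0.0] * N
    for (r, c), b in blocks.items():
        m, D = block_index(r, c, M), dct2(b)
        for i in range(n):
            for j in range(n):
                V[h(m, i, j, n, N)] += D[i][j]
    return V

def extract_summed(blocks, n, M, N):
    """Sum the blocks per index m first, so only M DCTs are needed."""
    VB = [[[0.0] * n for _ in range(n)] for _ in range(M)]
    for (r, c), b in blocks.items():
        m = block_index(r, c, M)
        for i in range(n):
            for j in range(n):
                VB[m][i][j] += b[i][j]
    V = [0.0] * N
    for m in range(M):
        D = dct2(VB[m])
        for i in range(n):
            for j in range(n):
                V[h(m, i, j, n, N)] += D[i][j]
    return V
```

Both functions should return the same vector up to floating-point rounding, while the second performs M DCTs instead of one per block.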

[0082] Figure 9 shows a flow diagram of the detection method for uncompressed video input data as described above.

[0083] The uncompressed video image data is provided to nxn accumulators, preferably 8x8 accumulators 90. The memory requirement is n² times the number of functions h_m(i,j). For each index m, the blocks with the same index m are summed, and the resultant M summed blocks are accumulated in the memory.

[0084] The output is the summed signal of each of the nxn blocks. The output is subject to a DCT transform 92. The number of transformations is proportional to the number of functions h_m(i,j). The result is a group of nxn DCTs, which is classified into N sets according to the functions h_m(i,j) and summed for extracting a watermark as mentioned before. The obtained watermark is provided to watermark accumulator 94 and accumulated. The memory requirement for accumulator 94 is proportional to the watermark length. The extracted watermark is provided as input to comparator 96. The other inputs to comparator 96 are the possible watermarks that may have been inserted into the input image data. The comparator computes a likelihood (normalized Fisher Z statistic) of each possible watermark having been inserted into the image data. The most likely watermark is determined and is deemed the watermark in the image.
[0085] A limitation of block-based DCT methods is their sensitivity to spatial shifts of the image. For example, if the image is shifted two pixels to the right, then the DCT coefficients change significantly, so that the watermark cannot be detected. Furthermore, general distortions, such as scaling and rotation, also make the watermark undetectable.
[0086] To solve these problems, the above insertion and extraction methods may be modified in two ways. The first possible modification is to insert multiple watermarks designed to survive predefined distortions of the video. The second modification is to arrange that translations can be compensated for without performing the summation more than once. Optionally, this second modification may be further modified to insert registration patterns, one for each of the multiple watermarks, which can be used by a modified watermark detector to compensate for arbitrary translations of the video.

[0087] When detecting watermarks inserted using the above method, it is necessary to divide the image into the same grid of nxn blocks as was used during insertion. If the image has been translated since watermark insertion, then determining the correct grid becomes difficult. In many applications, this is a serious problem, since certain specific transformations can be expected. For example, it can be expected that video on a DVD might be modified to fit on a standard television screen by conversion to either "panscan" or "letterbox" mode. In "panscan" mode, the horizontal image resolution is increased, and the image is cropped at a predetermined offset, so that the resulting image will be correct when viewed on a 3x4-aspect-ratio television screen. In "letterbox" mode, the image is scaled vertically, and black is added at the top and bottom, so that the whole image will fit correctly on a 3x4-aspect-ratio screen. Since these two geometric transformations are more likely than any other, it is reasonable to prepare for them specifically.
[0088] The problem of the predetermined scaling or transforming of watermarked video is solved in the present invention by inserting an additional watermark for each of the likely transformations. Each of these watermarks is designed so that, when the image has undergone the corresponding known transformation, the grid of nxn blocks used during insertion will align with a predetermined grid used during detection. Thus, if the image has undergone no transformation, then the detection grid will align with the normal mode watermark, and the normal mode watermark will be detected. If the image has undergone "panscan" transformation, then the same detection grid will align with the "panscan" watermark, and the "panscan" watermark will be detected, and so forth for "letterbox" scan or any other predefined transformation.

[0089] The procedure for inserting a watermark that is to be detected after a specific transformation comprises the following steps:

1. Make a copy, I_T, of the image being watermarked, I, and apply the transformation to be compensated for to the copy. For example, I_T might be a copy of I that has been transformed into "letterbox" mode by vertical shrinking of I.

2. Create a watermarked version of the transformed image, I_T', according to the general watermarking method described above.

3. Let W_T = I_T' - I_T be the spatial pattern that was added to I_T when it was watermarked.

4. Perform the inverse transformation on W_T to yield the corresponding watermark pattern, W, for the untransformed image. For example, if the transformation to be compensated for was "letterbox" mode, then W would be obtained by vertically expanding W_T.

5. Let I' = I + W be the image with a watermark added for the given transformation.

[0090] When the transformation is applied to the watermarked image I', the result will be approximately I_T', and the watermark will be detected by the same procedure designed to detect a normal watermark.
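The five steps above can be sketched in Python on a toy image. Everything named here is a stand-in chosen for illustration: `shrink` (keep every second row) plays the role of a "letterbox"-like transformation, `expand` (duplicate each row) plays the role of its inverse, and `embed` simply adds a fixed pattern in place of the general watermarking method.

```python
def shrink(img):
    """Hypothetical 'letterbox'-like transform: keep every second row."""
    return [row[:] for row in img[::2]]

def expand(img):
    """Inverse used in this sketch: duplicate every row."""
    return [row[:] for row in img for _ in range(2)]

def embed(img, pattern):
    """Placeholder for the general watermarking method: add a pattern."""
    return [[p + q for p, q in zip(r1, r2)] for r1, r2 in zip(img, pattern)]

def insert_for_transform(I, pattern):
    I_T = shrink(I)                              # step 1: transformed copy
    I_T_marked = embed(I_T, pattern)             # step 2: watermark the copy
    W_T = [[a - b for a, b in zip(r1, r2)]       # step 3: W_T = I_T' - I_T
           for r1, r2 in zip(I_T_marked, I_T)]
    W = expand(W_T)                              # step 4: inverse transform
    return [[a + b for a, b in zip(r1, r2)]      # step 5: I' = I + W
            for r1, r2 in zip(I, W)]
```

With these particular stand-ins, transforming I' reproduces I_T' exactly. For real resampling filters the match would only be approximate, as the text notes.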

[0091] This process can only be used for a small number of transformations, as each additional watermark causes additional degradation of the image and reduces the detectability of the other watermarks in the image. However, tests have shown that three watermarks (two for transformed images and one for the untransformed image) result in acceptable fidelity and good detectability. Alternatively, for video, each watermark can be inserted in a time-multiplexed manner.

[0092] In cases where the transformations that an image will undergo are not predefined or predetermined, or where there are too many probable transformations to allow for the insertion of a separate watermark for each transformation, the above described method of compensating for transformations is not optimal. Thus, the present invention includes an additional improvement, which compensates for arbitrary translation of the image between the times of watermark insertion and watermark detection.

[0093] Arbitrary translation is compensated for in two ways. The first way handles translations by whole multiples of 8 pixels in the x or y direction when 8x8 blocks are used. These can be easily compensated for if the following restrictions are imposed on the relationship between the index m of the function h_m(i,j) and the row and column numbers of the blocks, r and c. We determine the index m according to

m = f(r,c) mod M,

where f(r,c) is a linear function of r and c, and f(0,1) and f(1,0) are integers.
[0094] In addition, let h_m(i,j) be a function expressed as

h_m(i,j) = (h0(i,j) + a×m) mod N,

where a is an integer. These restrictions are assumed in the remaining part of the detailed description. In this case, the shift compensation is performed in the basic detection algorithm by computing the correlation of the extracted watermark with all cyclical shifts of the watermark being tested for. This is because the values VB_m[i,j] (m=0,...,M-1) obtained from a watermarked image shifted by a multiple of 8 pixels in the horizontal and/or the vertical directions are identical to the values VB_m[i,j] (m=0,...,M-1) obtained from the non-shifted image except that the indices m are cyclically shifted. As a result, the extracted values V[1,...,N] obtained from a watermarked image shifted by a multiple of 8 pixels are identical to the values V[1,...,N] obtained from the non-shifted image except that the elements V[k] are cyclically shifted. For an nxn grid format, the same is true of a shift by a multiple of n pixels in the horizontal and/or the vertical directions.
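Testing against all cyclic shifts of the candidate watermark can be sketched as follows. The function names `correlation` and `best_cyclic_shift` and the rotation convention (element k of W moves to position k + shift) are assumptions of this example.

```python
import math

def correlation(a, b):
    """Sample correlation coefficient of two equal-length sequences."""
    n = len(a)
    am, bm = sum(a) / n, sum(b) / n
    num = sum((x - am) * (y - bm) for x, y in zip(a, b))
    den = math.sqrt(sum((x - am) ** 2 for x in a) *
                    sum((y - bm) ** 2 for y in b))
    return num / den

def best_cyclic_shift(V, W):
    """Return (shift, C) maximizing the correlation of V with rotated W."""
    best_shift, best_c = 0, -2.0
    for shift in range(len(W)):
        # Rotate W so that element k moves to position k + shift.
        rotated = W[-shift:] + W[:-shift] if shift else list(W)
        c = correlation(V, rotated)
        if c > best_c:
            best_shift, best_c = shift, c
    return best_shift, best_c
```

The best correlation found this way can then be converted to a Z value and compared with the detection threshold.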
[0095] For shifts of less than 8 pixels in the x and/or y directions, an exhaustive search can be performed over all 64 possibilities, and the maximum Z value taken from the set of 64xM Z values, where M is the number of 8x8 accumulators. In our tests M was chosen to be 64. The factor M is necessary to account for the cyclic shifts that are introduced by shifts of a multiple of 8 pixels.

[0096] The exhaustive search requires shifting the M 8x8 accumulator arrays in the spatial domain and then performing the DCT of each of the M 8x8 blocks. This is performed 64 times, once for each of the possible shifts. Thus it is necessary to perform 64xM 8x8 DCTs. If this computation is too expensive in terms of time or memory, an alternative method can be used, as described below.

[0097] The second way, which compensates for translations that are not whole multiples of n pixels, uses a pattern (referred to as a "registration pattern") which can be inserted at the time of watermark insertion. By finding the location where the registration signal best matches a predefined signal, a detector can determine how much to shift the data before extracting the watermark. This shifting must be done in the spatial domain, but can be done with accumulators, so conversions of whole images are avoided.

[0098] Moreover, the 64xM 8x8 DCTs are unnecessary. Instead, the correct registration is determined in the spatial 
domain and then compensated for by shifting the pixels in the accumulator arrays. The M 8x8 accumulators are only 
then transformed into the DCT domain and the watermark extraction is performed as described above. 
[0099] The registration pattern is an 8x8 spatial pattern inserted into the image in such a way that the sum of all 8x8 pixel blocks highly correlates with the pattern. Again, an nxn spatial pattern is used when the video is divided into nxn blocks. A registration pattern can be inserted by using the watermark insertion method described above. The sum of all 8x8 pixel blocks becomes highly correlated with a registration pattern if the DCT coefficients of the sum block and those of the registration pattern are highly correlated with each other. In addition, the DCT coefficients of the sum block equal the sum of all DCT blocks because of the linearity of the DCT. We can thus insert a registration pattern using a method similar to that for inserting a watermark described above, considering the DCT coefficients of a registration pattern as a watermark W and considering the sum of all 8x8 DCT blocks as a value V.

[0100] Insertion is performed by converting the registration pattern into the DCT domain, and then using the basic insertion algorithm with different sets of DCT coefficients D_k and filter coefficients F_k. Specifically, each AC term of the registration pattern's DCT is considered one element of the watermark, W[k]. The set of DCT terms that are summed together to extract this element, D_k, is simply the set of corresponding terms of all the 8x8 DCTs in the image. All the F_k are set to 1. Using these D_k and F_k, the insertion algorithm inserts a registration pattern along with each watermark. The watermarks are still inserted with the original D_k and F_k.

[0101] During detection, registration is performed as follows. A predetermined number of blocks are summed together to form a single nxn, typically 8x8, block. We have arbitrarily used 64 blocks in our tests. This block contains a registration pattern placed there by the insertion process. To determine the horizontal and vertical translation of the frame, a correlation process is performed in the spatial domain. Sixty-four correlations are performed, one for each combination of the 8 horizontal and 8 vertical shifts that are possible. When the 8x8 patch is shifted either horizontally or vertically, a wrap-around shift is performed.

[0102] We now describe how to determine the shift of the grid. In the following method, we assume the integer values f(0,1) and f(1,0) are relatively prime to M.

[0103] First, the blocks for which the same function h_m(i,j) is used in classifying DCT coefficients are added together in the spatial domain to generate a summed block VB_m[i,j] for each m. The sum of all nxn blocks, denoted by AB[i,j], is computed by

AB[i,j] = Σ_{m=0}^{M-1} VB_m[i,j]   (i=1,...,n, j=1,...,n).

[0104] Then the correlation coefficient between AB[i,j] and a registration pattern, denoted by R[i,j], is computed. After calculation of the correlation coefficient, the values AB[i,j] are cyclically shifted by one column in the horizontal direction, and the correlation coefficient between AB[i,j] and R[i,j] is calculated in the same way. The same operations are repeated for each shift. After shifting n times, AB[i,j] becomes identical to AB[i,j] before any shift was done. Then, the values AB[i,j] are cyclically shifted by one row in the vertical direction, and the calculation of a correlation coefficient and the shift by one column in the horizontal direction are repeated. In this way, we can calculate correlation coefficients for all n² possible shifts. At the same time, we search for the shift value (offset), denoted by (X,Y), which gives the maximum correlation coefficient.
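The search over all n² wrap-around shifts can be sketched in Python as below. The function names, the zero-based indexing, and the shift direction (right by X, down by Y) are assumptions of this example. The sketch also evaluates every shift directly rather than shifting one column or row at a time, which visits the same n² candidates.

```python
import math

def cyclic_shift(block, dx, dy):
    """Shift an n x n block right by dx and down by dy with wrap-around."""
    n = len(block)
    return [[block[(i - dy) % n][(j - dx) % n] for j in range(n)]
            for i in range(n)]

def correlation2d(a, b):
    """Sample correlation coefficient between two n x n blocks."""
    fa = [x for row in a for x in row]
    fb = [x for row in b for x in row]
    ma, mb = sum(fa) / len(fa), sum(fb) / len(fb)
    num = sum((x - ma) * (y - mb) for x, y in zip(fa, fb))
    den = math.sqrt(sum((x - ma) ** 2 for x in fa) *
                    sum((y - mb) ** 2 for y in fb))
    return num / den

def find_offset(VB, R):
    """Return ((X, Y), C) for the shift of AB that best matches R."""
    n = len(R)
    # AB[i][j] is the sum of the M summed blocks VB_m[i][j].
    AB = [[sum(vb[i][j] for vb in VB) for j in range(n)] for i in range(n)]
    best, best_c = (0, 0), -2.0
    for Y in range(n):
        for X in range(n):
            c = correlation2d(cyclic_shift(AB, X, Y), R)
            if c > best_c:
                best, best_c = (X, Y), c
    return best, best_c
```

The returned offset is the shift that would then be applied to each of the M summed blocks.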

[0105] After the offset (X,Y) has been determined, the M summed blocks VB_m[i,j] are then shifted accordingly in the spatial domain.

[0106] Next, we describe the method to compensate for the shift value X in the horizontal direction. To do so, the values VB_m[i,j] (m=0,...,M-1) are first copied onto an nxnM array VB1[i,j] (i=1,...,n, j=1,...,nM). In copying the values VB_m[i,j], the spatial relationships between blocks, that is, which blocks adjoin a certain block, are considered. The function f(r,c) is linear, so

f(r,c+1) = f(r,c) + f(0,1).
[0107] This indicates that the blocks whose index m equals

m1 = (m0 + f(0,1)) mod M

are located next to the blocks in which m=m0. So, the values VB_m1 are copied next to the values VB_m0 in the array VB1. For this reason, for each l (l=0,...,M-1), the values VB_m'[i,j] (i=1,...,n, j=1,...,n, m'=l×f(0,1) mod M) are copied into the nxn square region of VB1 where the index of the left-top corner is (1, nl+1). After copying the values for all l, the array VB1 is filled with the values VB_m[i,j] because f(0,1) and M are relatively prime. Next, the values in the array VB1 are cyclically shifted by X in the horizontal direction. Then the values in VB1 are returned to VB_m in such a way that the values in the nxn region of VB1 where the index of the left-top corner is (1, nl+1) are substituted into VB_m'[i,j] (i=1,...,n, j=1,...,n, m'=l×f(0,1) mod M) for each l (l=0,...,M-1). The horizontal offset value X can thus be compensated for without shifting the whole image itself.
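The copy, shift and copy-back procedure for the horizontal direction can be sketched in Python. The function name `shift_blocks_horizontally`, the zero-based indexing, and the choice of shifting to the right are assumptions of this example; `f01` stands for the integer f(0,1).

```python
def shift_blocks_horizontally(VB, f01, X):
    """Cyclically shift M summed n x n blocks by X columns via the strip VB1."""
    M, n = len(VB), len(VB[0])
    # Region l of the strip holds the block with index m' = l*f(0,1) mod M.
    order = [(l * f01) % M for l in range(M)]
    if sorted(order) != list(range(M)):
        raise ValueError("f(0,1) and M must be relatively prime")
    width = n * M
    # Build the n x nM array VB1 from the ordered blocks.
    VB1 = [[VB[order[c // n]][i][c % n] for c in range(width)]
           for i in range(n)]
    # Cyclic shift to the right by X columns (direction is an assumption).
    VB1 = [[row[(c - X) % width] for c in range(width)] for row in VB1]
    # Return each n x n region of VB1 to the block it was taken from.
    out = [None] * M
    for l, m in enumerate(order):
        out[m] = [row[n * l:n * l + n] for row in VB1]
    return out
```

A shift of exactly n columns moves every block to the next index, m + f(0,1) mod M, which matches the cyclic shift of the indices m noted earlier for shifts by a multiple of n pixels.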

[0108] Next, the shift value Y in the vertical direction is compensated for. To do so, the values VB_m[i,j] (m=0,...,M-1) are first copied onto an nMxn array VB2[i,j] (i=1,...,nM, j=1,...,n). In copying the values VB_m[i,j], the spatial relationships between blocks are considered. The function f(r,c) is linear, so

f(r+1,c) = f(r,c) + f(1,0).

[0109] This indicates that the blocks whose index m equals

m1 = (m0 + f(1,0)) mod M

are located under the blocks in which m=m0. So, the values VB_m1 are copied under the values VB_m0 in the array VB2. For this reason, for each l (l=0,...,M-1), the values VB_m'[i,j] (i=1,...,n, j=1,...,n, m'=l×f(1,0) mod M) are copied into the nxn square region of VB2 where the index of the left-top corner is (nl+1, 1). After copying the values for all l, the array VB2 is filled with the values VB_m[i,j] because f(1,0) and M are relatively prime. Next, the values in the array VB2 are cyclically shifted by Y in the vertical direction. Then the values in VB2 are returned to VB_m in such a way that the values in the nxn region of VB2 where the index of the left-top corner is (nl+1, 1) are substituted into VB_m'[i,j] (i=1,...,n, j=1,...,n, m'=l×f(1,0) mod M) for each l (l=0,...,M-1). The vertical offset value Y can thus be compensated for without shifting the whole image itself.

[0110] The offset of the nxn grid is compensated for by the process mentioned above. These processes are performed in the registration process 108 (Figure 10), as will be explained later. A shift of a multiple of n remains even after these processes, but this shift does not affect the watermark detection because the correlation coefficient between a watermark W and an extracted value V is calculated by shifting the watermark W cyclically as described above. After the registration process above is applied, the M blocks VB_m (m=0,...,M-1) are transformed back to the DCT domain.

[0111] With reference now to Figures 10 and 11, there are shown the basic detection algorithms modified to compensate for translational registration. In the case of MPEG video input (Figure 10), 8x8 DCT blocks obtained from an MPEG video stream are first classified into M groups according to their indices m of the function h_m(i,j) and summed within the groups to generate M summed blocks, and the resultant summed blocks are accumulated in 8x8 accumulators 102. The M summed blocks in accumulators 102 must be converted into the spatial domain by performing an inverse DCT operation in inverse DCT converter 104, and are accumulated in accumulators 106. Finding the offset value of the 8x8 grid and compensating for the offset are executed on the output from 8x8 accumulators 106 in registration 108 as described above. The registration data outputted from registration process 108 is accumulated in accumulators 110 and converted into the DCT domain in DCT converter 112 for watermark extraction by use of accumulators 114, watermark extractor 116 and watermark decoder 118. In watermark extractor 116, the DCT coefficients outputted from accumulators 114 are classified into N sets according to the functions h_m(i,j) and summed for extracting a watermark. The obtained watermark is provided to watermark decoder 118, in which the processes executed in comparator 84 in Figure 8 are performed for finding a watermark corresponding to the extracted watermark. The watermark considered to have been inserted is outputted from the watermark decoder 118. In the case of uncompressed input data (Figure 11), the input data is divided into 8x8 blocks and accumulated in accumulators 106 according to the indices of the functions h_m(i,j), and registration 108 is performed before conversion into the DCT domain in DCT converter 112. The process continues as described above.

[0112] While there have been described and illustrated methods of insertion and detection of watermarks in image data, it will be understood by those skilled in the art that variations and modifications are possible without deviating from the spirit and broad teachings of the present invention, which shall be limited solely by the scope of the claims appended hereto.

Claims 

1. A method of inserting a watermark signal into data to be watermarked comprising the steps of:

applying a watermark extraction to unwatermarked data for generating extracted values; 
computing the difference between a watermark to be inserted and the extracted values; and 
adding the computed difference throughout the unwatermarked data. 
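
The three steps of claim 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions: extraction is taken to be a sum of coefficients per watermark element, the mapping `element_of` is hypothetical, and the difference is spread evenly over the contributing coefficients. The dependent claims describe other distributions (perceptual, attack-aware, stochastic) that this sketch does not model.

```python
def extract(coeffs, element_of, n_elements):
    """Sum coefficients into watermark elements.

    element_of[p] is the watermark element that coefficient p contributes to,
    or None if the coefficient is unused (hypothetical mapping).
    """
    values = [0.0] * n_elements
    for c, k in zip(coeffs, element_of):
        if k is not None:
            values[k] += c
    return values

def insert(coeffs, element_of, target):
    """Insert `target` by adding (target - extracted) throughout the data."""
    extracted = extract(coeffs, element_of, len(target))
    counts = [0] * len(target)
    for k in element_of:
        if k is not None:
            counts[k] += 1
    marked = list(coeffs)
    for p, k in enumerate(element_of):
        if k is not None:
            # Even split of the difference; an assumption of this sketch.
            marked[p] += (target[k] - extracted[k]) / counts[k]
    return marked

coeffs = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0]   # unwatermarked data
element_of = [0, 1, 0, None, 1, 2]
target = [1.0, -1.0, 2.0]                   # watermark to be inserted
marked = insert(coeffs, element_of, target)
```

After insertion, applying the same extraction to `marked` returns the target watermark, which is the property the difference-based formulation is designed to give.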

2. A method of inserting a watermarked signal into data to be watermarked as set forth in claim 1, wherein said adding
the computed difference comprises distributing the computed difference according to a perceptual model of human 
visual sensitivity. 

3. A method of inserting a watermarked signal into data to be watermarked as set forth in claim 1, wherein said adding
the computed difference comprises distributing the computed difference according to how affected the data will be 
to a predetermined attack so that the attack will not affect the watermarked data. 

4. A method of inserting a watermarked signal into data to be watermarked as set forth in claim 1, wherein said adding
the computed difference comprises distributing the computed difference according to a stochastic method to 
reduce the susceptibility to tampering. 

5. A method of inserting a watermarked signal into data to be watermarked as set forth in claim 1, wherein said
unwatermarked data is compressed video and said compressed video is divided into groups of blocks, where said
computing the difference computes the difference between a watermark and the sum of the extracted values from each
group of blocks and said adding distributes the difference into the group of blocks.

6. A method of inserting a watermarked signal into data to be watermarked as set forth in claim 5, wherein said groups 
of blocks are selected according to response to predetermined attacks and signal degradations. 

7. A method of detecting a watermark from watermarked data comprising the steps of: 

receiving watermarked data where the watermarked data comprises nxn blocks of watermarked data; 


summing the nxn blocks of watermarked data to form at least one nxn block of summed watermarked data; 
cyclically shifting the block in the spatial domain to register the block; and 
extracting the watermark from the at least one nxn block. 
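
The reason a cyclic shift of the summed block can register it may be illustrated with a short Python sketch. It assumes an idealized case: the image dimensions are multiples of n and the translation wraps around at the image borders, in which case the relationship is exact. For a real translation the border blocks differ, so the relationship holds only approximately. The function names are illustrative.

```python
import random

N = 8  # block size

def roll2d(a, dx, dy):
    """Move a 2-D list down by dx rows and right by dy columns, wrapping."""
    h, w = len(a), len(a[0])
    return [[a[(x - dx) % h][(y - dy) % w] for y in range(w)]
            for x in range(h)]

def sum_blocks(image):
    """Sum all N x N blocks of an image into a single N x N block."""
    acc = [[0] * N for _ in range(N)]
    for x, row in enumerate(image):
        for y, value in enumerate(row):
            acc[x % N][y % N] += value
    return acc

random.seed(1)
image = [[random.randint(0, 255) for _ in range(32)] for _ in range(24)]

reference = sum_blocks(image)                  # summed block, aligned grid
shifted_sum = sum_blocks(roll2d(image, 3, 5))  # summed block after translation
registered = roll2d(shifted_sum, -3, -5)       # undo the offset on one block
```

Here `shifted_sum` equals `reference` cyclically shifted by (3, 5), so registration only has to shift a single n x n block rather than realign the whole image.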

8. A method of detecting a watermark from watermarked data comprising the steps of:

receiving watermarked data where the watermarked data comprises nxn blocks of watermarked data;
summing the nxn blocks of watermarked data to form at least one nxn block of summed watermarked data; and
extracting the watermark from the at least one nxn block by summing predetermined watermarked data from
each nxn block to yield each element of the extracted watermark.

9. A method of detecting a watermark from watermarked data as set forth in claim 8, wherein said extracting is
performed in the spatial domain using at least one nxn block.

10. A method of detecting a watermark from watermarked data as set forth in claim 8, wherein said extracting is
performed in the frequency domain using at least one nxn block.

11. A method of detecting a watermark from watermarked data as set forth in any of claims 8 to 10, wherein each nxn
block is filtered before said extracting.

12. A method of detecting a watermark from watermarked data comprising the steps of: 

receiving watermarked data where the watermarked data comprises nxn blocks of watermarked data; 
summing the nxn blocks of watermarked data to form M nxn blocks of summed watermarked data, where M is 
the number of functions mapping frequency indices onto the indices of elements of watermarks; and 
extracting the watermark from the at least one nxn block. 
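
A minimal Python sketch of this detection structure is given below, operating on blocks that are assumed to be already in the frequency domain. The assignment of input blocks to the M accumulators (block index modulo M) and the particular mapping `h` from frequency indices to watermark element indices are hypothetical choices made only for illustration.

```python
N = 8  # block size

def accumulate(blocks, m_groups):
    """Sum blocks into m_groups accumulators.

    Block b is added to accumulator b % m_groups (assumed grouping rule).
    """
    acc = [[[0.0] * N for _ in range(N)] for _ in range(m_groups)]
    for b, block in enumerate(blocks):
        target = acc[b % m_groups]
        for i in range(N):
            for j in range(N):
                target[i][j] += block[i][j]
    return acc

def h(m, i, j, n_elements):
    """Hypothetical h_m(i, j): watermark element index for frequency (i, j).

    The DC term (0, 0) is left out of the watermark and maps to None.
    """
    if i == 0 and j == 0:
        return None
    return (i * N + j + m) % n_elements

def extract(summed, n_elements):
    """Sum the coefficients of each summed block into watermark elements."""
    watermark = [0.0] * n_elements
    for m, block in enumerate(summed):
        for i in range(N):
            for j in range(N):
                k = h(m, i, j, n_elements)
                if k is not None:
                    watermark[k] += block[i][j]
    return watermark

blocks = [[[1.0] * N for _ in range(N)] for _ in range(4)]  # toy DCT blocks
summed = accumulate(blocks, 2)      # M = 2 summed blocks
watermark = extract(summed, 7)      # 7-element extracted watermark
```

Using a different mapping for each of the M summed blocks lets the same watermark element draw on different frequencies in different groups of blocks.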

13. A method of detecting a watermark from watermarked data as set forth in claim 12, wherein said forming is
performed in the spatial domain.

14. A method of detecting a watermark from watermarked data as set forth in claim 12, wherein said forming is
performed in the frequency domain.

15. A method of inserting a watermark signal into video images comprising the steps of:

receiving input video images;
performing discrete cosine transformation (DCT) of said input video images to obtain input video image DCT
values;
adding watermark signals to said input video image DCT values to obtain watermarked DCT values;
performing an inverse DCT on the watermarked DCT values for generating watermarked images;
subtracting predicted images stored in memory from said watermarked images for generating residual images;
performing discrete cosine transformation of said residual images to obtain residual image DCT values and
quantizing the residual image DCT values to derive residual image quantized DCT values;
inverse quantizing the residual image quantized DCT values to obtain residual image de-quantized DCT values;
performing discrete cosine transformation of the predicted images to obtain predicted image DCT values;
summing the residual image de-quantized DCT values and the predicted image DCT values for generating
decoded image DCT coefficients;
calculating correction signals by applying a watermark extraction to the decoded image DCT coefficients and
comparing resultant extracted values with watermark target values;
adding the correction signals to the residual image quantized DCT values for obtaining an output signal;
inverse quantizing the output signal to obtain inverse quantized output data, performing inverse DCT of the
inverse quantized output signal and summing the resultant signal with the predicted images for generating a
summed signal;
generating predicted images to be stored in said memory by using the summed signal; and
variable length encoding the output signal for providing a watermarked MPEG video signal of the input video
images.

16. A method of inserting a watermark signal into video images comprising the steps of:

receiving input video images;
performing discrete cosine transformation (DCT) of said input video images to obtain input video image DCT
values;
adding watermark signals to said input video image DCT values to obtain watermarked DCT values;
performing discrete cosine transformation on predicted images stored in memory to obtain predicted image
DCT values;
subtracting the predicted image DCT values from the watermarked DCT values to obtain residual image DCT
values;
quantizing the residual image DCT values to obtain residual image quantized DCT values;
inverse quantizing the residual image quantized DCT values to obtain residual image de-quantized DCT values
and summing the residual image de-quantized DCT values with the predicted image DCT values, for generating
decoded image DCT coefficients;
calculating correction signals by applying a watermark extraction to the decoded image DCT coefficients and
comparing resultant extracted values with watermark target values;
summing the correction signals with the residual image quantized DCT values to obtain an output signal;
inverse quantizing the output signal to obtain inverse quantized output signal, performing inverse DCT of the
inverse quantized output signal and summing the resultant signal with the predicted images to provide
locally-decoded images;
generating predicted images to be stored in said memory using the locally-decoded images; and
variable length encoding the output signal to provide a watermarked MPEG video signal of the input video
images.

17. A method of inserting a watermark signal into video images as set forth in claim 15 or 16, wherein said calculating
step generates a negative value as said correction signal when the extracted value exceeds the corresponding
value of said watermark target values, and generates a positive value as said correction signal when the extracted
value falls short of the corresponding value of said watermark target values.

18. A method of inserting a watermark signal into video images as set forth in claim 17, wherein said calculating step
generates a zero value as the correction signal if the corresponding value of said residual image quantized DCT
values is zero.

19. A method of inserting a watermark signal into video images as set forth in claim 17 or 18, wherein said positive 
value is +1 and said negative value is -1. 
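
The correction rule of claims 17 to 19 can be sketched in Python as follows. The mapping `element_of` from coefficient positions to watermark elements and the sample values are hypothetical; the sketch also returns zero when the extracted value already equals the target, a case the claims do not address.

```python
def correction_signals(extracted, target, element_of, quantized_residual):
    """Return one correction value (-1, 0 or +1) per quantized coefficient.

    element_of[p] is the watermark element that coefficient p contributes to,
    or None if the coefficient carries no watermark (hypothetical mapping).
    """
    corrections = []
    for p, q in enumerate(quantized_residual):
        k = element_of[p]
        if k is None or q == 0:
            corrections.append(0)    # claim 18: zero coefficients stay zero
        elif extracted[k] > target[k]:
            corrections.append(-1)   # extracted value exceeds the target
        elif extracted[k] < target[k]:
            corrections.append(1)    # extracted value falls short of the target
        else:
            corrections.append(0)    # already on target (assumed behaviour)
    return corrections

extracted = [5.0, 1.0]               # from the decoded image DCT coefficients
target = [3.0, 2.0]                  # watermark target values
element_of = [0, 1, 0, None, 1]
quantized_residual = [4, 0, -2, 7, 3]

corrections = correction_signals(extracted, target, element_of,
                                 quantized_residual)
output_signal = [q + c for q, c in zip(quantized_residual, corrections)]
```

Leaving zero-valued quantized coefficients untouched keeps the zero runs of the residual intact, which plausibly limits the effect of the correction on the variable length coded bit rate.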

20. A method of inserting an extractable watermark signal into image data so that the watermark signal can be
extracted after the watermarked data is subject to distortion comprising the steps of:

providing image data to be watermarked;
copying the image data;
applying a predefined distortion to the copy of image data;
inserting a watermark signal into the distorted copy of image data;
subtracting the copy of distorted image data from the watermarked copy of distorted image data to generate a
watermark pattern;
performing inverse distortion on the watermark pattern to generate a watermark pattern for nondistorted
image data; and
combining said image data to be watermarked and said watermark pattern for nondistorted image data to
obtain watermarked data.
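
The steps of claim 20 can be sketched in Python with a deliberately simple stand-in for the distortion. Here the predefined distortion is assumed to be a horizontal crop (a rough model of panscan without rescaling), the inverse distortion is zero padding back to full width, the combination is plain addition, and `toy_insert` is a placeholder for a real watermark inserter. All of these are assumptions made for illustration.

```python
def crop(image, left, width):
    """Predefined distortion (assumed): keep `width` columns from `left`."""
    return [row[left:left + width] for row in image]

def pad(pattern, left, full_width):
    """Inverse distortion (assumed): zero-pad back to the full width."""
    return [[0] * left + row + [0] * (full_width - left - len(row))
            for row in pattern]

def toy_insert(image, strength=2):
    """Placeholder inserter: adds a +/-strength checkerboard."""
    return [[v + strength * (1 if (x + y) % 2 == 0 else -1)
             for y, v in enumerate(row)]
            for x, row in enumerate(image)]

def insert_for_distortion(image, left, width):
    copy = [row[:] for row in image]                 # copy the image data
    distorted = crop(copy, left, width)              # apply the distortion
    marked = toy_insert(distorted)                   # watermark the copy
    pattern = [[m - d for m, d in zip(mr, dr)]       # watermark pattern
               for mr, dr in zip(marked, distorted)]
    restored = pad(pattern, left, len(image[0]))     # inverse distortion
    return [[v + p for v, p in zip(ir, pr)]          # combine with original
            for ir, pr in zip(image, restored)]

image = [[10 * x + y for y in range(8)] for x in range(4)]
watermarked = insert_for_distortion(image, 2, 4)
```

When the same crop is later applied to `watermarked`, the result equals the watermarked distorted copy, so a detector working on the distorted data sees the watermark as it was inserted.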

21. A method of inserting a watermark signal into image data as set forth in claim 20, wherein said predetermined
distortion is panscan mode image data.

22. A method of inserting a watermark signal into image data as set forth in claim 20, wherein said predetermined
distortion is letterbox mode image data.

23. A method of inserting a watermark signal into image data as set forth in claim 20, wherein said combining includes
combining with perceptual modeling.

24. A method of inserting an extractable watermark signal into image data so that the watermark signal can be
extracted after the watermarked data is subject to distortion comprising the steps of:

receiving image data in form of nxn blocks; and
inserting an nxn registration pattern into the image data.

25. A method of inserting an extractable watermark signal into image data as set forth in claim 24, wherein said
inserting is performed by converting the registration pattern into the DCT domain and then inserting the DCT values of
the registration pattern into the image data.

26. A method of detecting a watermark from watermarked image data that has been subjected to a two dimensional
translation distortion comprising the steps of:

partitioning received distorted watermarked image data into a set of nxn blocks;
summing together watermarked image data in a predetermined number of nxn blocks;
if summed nxn blocks are not in spatial domain, transforming summed nxn blocks into spatial domain;
determining two-dimensional translation distortion of the summed nxn blocks;
shifting the summed nxn blocks according to the determined distortion;
transforming the shifted nxn blocks into transform domain;
extracting a possible watermark signal from the transformed nxn blocks;
correlating the possible watermark with a known watermark;
repeating said correlating for cyclic shifts of the possible watermark; and
determining the presence of a watermark based on said correlating.

27. A method of detecting a watermark from watermarked image data as set forth in claim 26, where the two
dimensional translation is a geometric shift in at least one of the horizontal or vertical direction.

28. A method of detecting a watermark from watermarked image data that has been subjected to a two dimensional
translation distortion comprising the steps of:

(a) partitioning received distorted watermarked image data in nxn blocks;
(b) summing together watermarked image data in a predetermined number of nxn blocks;
(c) if summed nxn blocks are not in spatial domain, transforming summed nxn blocks into spatial domain;
(d) shifting the summed nxn blocks by one of n^2 possible shifts;
(e) transforming the shifted nxn blocks into transform domain;
(f) extracting a possible watermark signal from the transformed nxn blocks;
(g) correlating the possible watermark with a known watermark;
(h) repeating said correlating for cyclic shifts of the possible watermark;
(i) repeating steps (d)-(h) for each of the n^2 possible shifts; and
(j) determining the presence of a watermark based on the results of step (i).
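
Steps (d) to (j) amount to an exhaustive search, sketched below in Python for a single summed spatial block. The sketch makes several assumptions: the transform is an orthonormal DCT, the possible watermark is read from the first AC coefficients in row-major order, correlation is a normalized dot product, and the decision in step (j) is left to the caller as a comparison of the best score with a threshold. None of these choices is prescribed by the claim.

```python
import math

def transform(block, inverse=False):
    """Orthonormal 2-D DCT of a square block (inverse DCT if inverse=True)."""
    n = len(block)
    c = [[math.sqrt((1 if u == 0 else 2) / n)
          * math.cos((2 * x + 1) * u * math.pi / (2 * n))
          for x in range(n)] for u in range(n)]
    if inverse:
        c = [[c[u][x] for u in range(n)] for x in range(n)]  # transpose
    rows = [[sum(c[u][x] * block[x][y] for x in range(n))
             for y in range(n)] for u in range(n)]
    return [[sum(rows[u][y] * c[v][y] for y in range(n))
             for v in range(n)] for u in range(n)]

def roll2d(a, dx, dy):
    """Cyclically move a block down by dx rows and right by dy columns."""
    n = len(a)
    return [[a[(i - dx) % n][(j - dy) % n] for j in range(n)]
            for i in range(n)]

def extract(coeffs, length):
    """Assumed extractor: first `length` AC coefficients, row-major order."""
    flat = [v for row in coeffs for v in row]
    return flat[1:1 + length]

def correlate(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm else 0.0

def detect(summed_block, known):
    """Try all n^2 block shifts and all cyclic shifts of the extracted signal."""
    n = len(summed_block)
    best = (-2.0, None, None)
    for dx in range(n):
        for dy in range(n):
            candidate = extract(transform(roll2d(summed_block, dx, dy)),
                                len(known))
            for s in range(len(known)):
                score = correlate(candidate[s:] + candidate[:s], known)
                if score > best[0]:
                    best = (score, (dx, dy), s)
    return best

known = [1, -1, 1, 1, -1, -1, 1, -1, -1, -1, 1, 1, 1, -1, 1, -1]
coeffs = [[0.0] * 8 for _ in range(8)]
for p, w in enumerate(known, start=1):
    coeffs[p // 8][p % 8] = float(w)
observed = roll2d(transform(coeffs, inverse=True), 3, 5)  # simulated offset
score, shift, rotation = detect(observed, known)
```

In this toy setup the search selects the block shift that undoes the simulated offset. A practical detector would compare `score` against a threshold chosen for an acceptable false-positive rate.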

29. A method of detecting a watermark from watermarked image data as set forth in claim 26 or 28, wherein said nxn
block is an 8x8 block.

30. A method of detecting a watermark from watermarked image data as set forth in claim 26 or 28, wherein said
correlating is performed for n^2 possible shifts.